Techniques to facilitate a migration process to cloud storage

ABSTRACT

Techniques to facilitate a migration process of source data to cloud storage are described. These techniques use configuration information with pre-configured settings for the migration process by leveraging such information to build a component to execute the migration process. These settings can be used to identify computing modules (including interfaces) for generating a script for loading the source data to a storage location managed by a cloud storage service. The script may rely upon a data model for organizing the source data, which also is provided in the settings. Once the source data is available, the source data is converted into a suitable migration dataset and communicated with the script to the cloud storage service in a single operation. Other embodiments are described and claimed.

BACKGROUND

Organizations, including multi-national corporations, invest in technological equipment to meet their information technology demands and in personnel to operate such equipment. The same organizations are looking for ways to reduce costs related to such equipment and personnel. This includes the purchasing of cloud storage/computing services in terms of capabilities and/or capacities. For instance, an organization may purchase storage space with a capacity (e.g., one (1) terabyte) and a particular data transfer rate in addition to a compute node with a capability of loading database records with JAVA code. Cloud-based services offer several advantages (in general) over an on-premises enterprise facility housing various types of storage technologies; valuable time and resources at that facility can be used for other enterprise tasks. Despite these advantages, managing cloud-based services can be tedious and expensive. When a file or set of files arrives at the on-premises facility for migration to the cloud storage, the file or set of files has to be manually configured by personnel for that migration.

It is with respect to these and other considerations that the present improvements have been desired.

SUMMARY

The following presents a simplified summary in order to provide a basic understanding of some novel embodiments described herein. This summary is not an extensive overview, and it is not intended to identify key/critical elements or to delineate the scope thereof. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.

Various embodiments are generally directed to techniques to facilitate a migration process to cloud storage of source data from various client devices. Some embodiments are particularly directed to techniques to leverage configuration information with pre-configured settings for building a component to execute the migration process. Different computing modules (including interfaces) may be added or removed from the component as needed for a particular migration.

In one embodiment, for example, an apparatus may comprise a processing circuit and logic stored in computer memory and executed on the processing circuit, the logic operative to cause the processing circuit to access configuration information associated with migrating source data to cloud storage. The configuration information includes settings directed towards storing the source data in the cloud storage. The logic is further operative to cause the processing circuit to convert file data into a migration dataset in accordance with the settings in the configuration information. The file data is to be migrated to a storage location over a network. The logic is further operative to cause the processing circuit to generate a script comprising operations for storing the file data in the migration dataset, the script being based upon the settings in the configuration information. The logic is further operative to cause the processing circuit to communicate the script and the migration dataset to a server associated with the cloud storage service, the server configured to execute the script and store the migration dataset in the storage location in accordance with the settings in the configuration information. Other embodiments are described and claimed.

To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings. These aspects are indicative of the various ways in which the principles disclosed herein can be practiced and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. Other advantages and novel features will become apparent from the following detailed description when considered in conjunction with the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an embodiment of a system to facilitate a migration process to cloud storage of source data.

FIG. 2 illustrates an embodiment of an apparatus to implement the system of FIG. 1.

FIG. 3 illustrates embodiments of example operating environments for the system of FIG. 1.

FIG. 4 illustrates an embodiment of a migration process of FIG. 1.

FIG. 5 illustrates an embodiment of a centralized system for the system of FIG. 1.

FIG. 6 illustrates an embodiment of a logic flow for the system of FIG. 1.

FIG. 7 illustrates an embodiment of a computing architecture.

FIG. 8 illustrates an embodiment of a communications architecture.

DETAILED DESCRIPTION

Various embodiments are directed to facilitating a migration process of source data from an enterprise system to an external system hosting services over a network. As described herein, these services include cloud-based services, such as a cloud storage service or a cloud computing service. The enterprise system may be concentrated in a centralized environment (i.e., on-premises facility) or a distributed environment (i.e., off-premises devices). When data items (i.e., files such as database records) are to be migrated, the embodiments described herein operate to convert the data items into a suitable migration dataset and then generate a script for storing the converted data items into a (cloud) storage location. The present disclosure describes these embodiments as being automatic and, in a single step, improving upon conventional techniques where the migration of any source data is spread out over several steps and one or more human operators. As described herein, the embodiments of the present disclosure refer to an improved migration process where (more or less) a single component (e.g., an executable pipeline) performs the above-mentioned source data conversion and script generation.

It is appreciated that the present disclosure describes embodiments that leverage configuration information to enable the improved migration process. In the various embodiments, carefully designed configuration information prepares a system to automatically migrate any source data to cloud storage. A portion of the configuration information includes various settings related to a cloud storage service. Having the various settings beforehand, the system of the present disclosure may automatically convert the data items into the migration dataset and generate a script for loading the converted data items of the migration dataset. The script may instruct the cloud storage service on storing the converted data items based upon a data model in operation at a storage location for the migration dataset.

To illustrate by way of example, when a file (e.g., a media file or a database record) or a set of files arrives at an on-premises facility for migration to the cloud storage service, the system described herein accesses the settings in the configuration information, converts the file or set of files into a migration dataset, and communicates a script to the cloud storage service. It is appreciated that in some embodiments, the script may be deployable from the cloud storage service, eliminating any requirement that the system generates and communicates the script. In other embodiments, the system invokes an executable pipeline that is pre-configured with modules corresponding to the settings, eliminating any requirement that the system access the settings.

It is further appreciated that prior to the present disclosure, the file or set of files would have had to be manually configured by personnel for that migration. Often, a team of professionals must define various settings for the migration process and then enter those settings each time a file is to be migrated. More importantly, the team operates in a step-by-step process such that some settings are entered consecutively.

With general reference to notations and nomenclature used herein, the detailed descriptions which follow may be presented in terms of program processes executed on a computer or network of computers. These process descriptions and representations are used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art.

A process is here, and generally, conceived to be a self-consistent sequence of operations leading to the desired result. These operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic, or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to those quantities.

Further, the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein which form part of one or more embodiments. Rather, the operations are machine operations. Useful machines for performing operations of various embodiments include general-purpose digital computers or similar devices.

Various embodiments also relate to apparatus or systems for performing these operations. This apparatus may be specially constructed for the required purpose, or it may comprise a general-purpose computer as selectively activated or reconfigured by a computer program stored in the computer. The processes presented herein are not inherently related to a particular computer or other apparatus. Various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given.

Reference is now made to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding thereof. It may be evident, however, that the novel embodiments can be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate a description thereof. The intention is to cover all modifications, equivalents, and alternatives consistent with the claimed subject matter.

FIG. 1 illustrates a block diagram for a system 100. In one embodiment, the system 100 may comprise a computer-implemented system 100 having a software application 120 comprising one or more components 122-a. Although the system 100 shown in FIG. 1 has a limited number of elements in a certain topology, it may be appreciated that the system 100 may include more or fewer elements in alternate topologies as desired for a given implementation.

It is worthy to note that “a” and “b” and “c” and similar designators as used herein are intended to be variables representing any positive integer. Thus, for example, if an implementation sets a value for a=5, then a complete set of components 122-a may include components 122-1, 122-2, 122-3, 122-4, and 122-5. The embodiments are not limited in this context.

The system 100 may comprise the application 120. As mentioned above, the system 100 includes the application 120 as a type of software application running on an electronic device, such as a desktop application running on an operating system of a computing device, a mobile application running on a mobile operating system of a mobile device, or a web application running on a browser component of either the mobile operating system or the desktop operating system. Those skilled in the art would understand how to design, build, and deploy the software application on any type of electronic device.

The application 120 may be generally arranged to process input 110, of which some input may be provided directly to an interface component 122-1 via an input device, and other input may be provided to the interface component 122-1 via a network. For example, a user may enter data via a keyboard device attached to a computing device running the application 120. The application 120 may be generally arranged to generate output 130 for the interface component 122-1 of which some output may be configured for display on a display device, and other output may be communicated across the network to other devices. As an example, the application 120 may generate data that can be processed/rendered by the interface component 122-1 into content for a Graphical User Interface (GUI).

The application 120 may be generally arranged to provide a practical improvement by way of a settings component 122-2 and a migration component 122-3. As described herein, the migration component 122-3 may represent a single component for executing a migration process automatically for a set of files to cloud storage. An example embodiment of the migration component 122-3 is an executable pipeline of modules that are executed (in an ordering) to complete the migration process such that the set of files is maintained by a cloud storage service. As described herein, the executable pipeline includes a set of modules that in general secure the set of files from misappropriation (e.g., via tokenization and/or encryption), ensure proper formatting and/or encoding, and provide a proper interface for loading the set of files into a storage location at the cloud storage service.

The application 120 may further comprise the settings component 122-2. The settings component 122-2 may be generally arranged to accept user input and generate configuration information (i.e., settings) for the above-mentioned migration process for source data to the storage location in the cloud storage. The present disclosure envisions the configuration information as encompassing any data related to facilitating the above-mentioned migration process; it is appreciated that the present disclosure does not foreclose on any particular area of the migration process to configure by way of such configuration information. The configuration information generally includes settings for modifying files in preparation for the migration process. Example settings include but are not limited to an encoding format, a file format, a tokenization parameter, an encryption scheme, and/or a compression scheme. Other example settings include a network address and other identification information for the cloud storage service maintaining the storage location for the modified files. As an alternative, the configuration information may identify a proxy computer for the cloud storage service. Other example settings may refer to control directives on scripting the migration process.
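
As a rough illustration of the example settings listed above, the configuration information could be held in a small typed record. The sketch below is not taken from the disclosure: the class name MigrationSettings, its field names, the JSON layout, and the example.invalid address are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MigrationSettings:
    """Hypothetical container for the example settings named in the text."""
    encoding_format: str = "utf-8"
    file_format: str = "csv"
    encryption_scheme: Optional[str] = None
    compression_scheme: Optional[str] = None
    tokenize_fields: Tuple[str, ...] = ()
    storage_address: str = ""


def load_settings(text: str) -> MigrationSettings:
    """Parse settings from JSON text; unknown keys raise TypeError."""
    raw = json.loads(text)
    # JSON arrays arrive as lists; store them as an immutable tuple.
    raw["tokenize_fields"] = tuple(raw.get("tokenize_fields", ()))
    return MigrationSettings(**raw)


settings = load_settings(
    '{"storage_address": "https://storage.example.invalid/bucket",'
    ' "tokenize_fields": ["account_number"]}'
)
```

Fields omitted from the JSON text fall back to the defaults, which is one way such settings could be pre-configured once and reused for each migration.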

The migration component 122-3 may be generally arranged to process the configuration information mentioned above. In some embodiments, the migration component 122-3 is configured to use that information to convert the file data into a migration dataset suitable for the cloud storage service, generate a script to load the modified file data into the storage location, and communicate that script to the cloud storage service for execution. The script may be compatible with an electronic device running the application 120 and/or with a dedicated proxy computer residing between the electronic device and the cloud storage service.

In some embodiments, the settings component 122-2 may build the migration component 122-3 using the above-mentioned configuration information. In this manner, the settings component 122-2 combines a set of computing modules into the migration process where some modules perform tasks converting the file data/the modified file data, and some modules perform tasks generating/communicating a script to the cloud storage service. The set of computing modules can be invoked as a group or an executable pipeline when needed (i.e., dynamically). As such, the migration component 122-3 is ready for deployment when the files arrive for migration to the cloud storage service. A user may drag-and-drop a file to migrate into an icon corresponding to the migration process, and the user's device can execute the migration component to complete the migration process.
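
One way to picture building the migration component from settings is a registry of named modules that are composed into a single callable. This is a minimal sketch under stated assumptions: the registry, the two module names, and build_pipeline are hypothetical and stand in for whatever modules a given embodiment would use.

```python
from typing import Callable, Dict, List

Module = Callable[[bytes], bytes]

# Hypothetical module registry; a real system would register converters,
# tokenizers, encryptors, and so forth under names used in the settings.
MODULE_REGISTRY: Dict[str, Module] = {
    "strip_bom": lambda data: data[3:] if data.startswith(b"\xef\xbb\xbf") else data,
    "normalize_newlines": lambda data: data.replace(b"\r\n", b"\n"),
}


def build_pipeline(module_names: List[str]) -> Module:
    """Resolve module names once, then return a callable that runs them in order."""
    modules = [MODULE_REGISTRY[name] for name in module_names]  # KeyError if unknown

    def run(data: bytes) -> bytes:
        for module in modules:
            data = module(data)
        return data

    return run


pipeline = build_pipeline(["strip_bom", "normalize_newlines"])
result = pipeline(b"\xef\xbb\xbfa\r\nb")
```

Resolving the names at build time means a misconfigured setting fails before any file arrives, rather than partway through a migration.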

In some embodiments, the migration component 122-3 implements the migration process as a method having a step of accessing the configuration information associated with migrating source data to the cloud storage, the configuration information comprising settings directed towards storing the source data in the cloud storage. The migration component 122-3 performs a step of converting the file data into a migration dataset in accordance with the settings in the configuration information and applying a tokenization mechanism to a portion of the file data having target data, the portion being identified in the settings in the configuration information. In embodiments, the system may convert the file data in response to receiving the file data over the network. The tokenization mechanism is an example of the above-mentioned computing module and, when executed, replaces the target data with a token value (e.g., a Bank Account Value). The configuration information may include byte addresses (e.g., offsets) for the target data. A byte address may identify a data field or another content item having the target data. The configuration information may also include a file name, a file type, etc. associated with the target data to tokenize.
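
The offset-based tokenization described above can be sketched as follows. This is an illustrative assumption rather than the disclosed mechanism: the function name, the use of random hexadecimal tokens of the same length as the field, and the in-memory vault that maps tokens back to the original values are all hypothetical.

```python
import secrets
from typing import Dict, List, Tuple


def tokenize_at_offsets(
    data: bytes,
    spans: List[Tuple[int, int]],
    vault: Dict[bytes, bytes],
) -> bytes:
    """Replace each (offset, length) span with a same-length random token.

    The vault records token -> original value so the data could later be
    detokenized by an authorized party.
    """
    out = bytearray(data)
    for offset, length in spans:
        original = bytes(out[offset:offset + length])
        # token_hex(n) yields 2*n hex characters; trim to the field length
        # so that later byte offsets remain valid.
        token = secrets.token_hex((length + 1) // 2)[:length].encode("ascii")
        vault[token] = original
        out[offset:offset + length] = token
    return bytes(out)


record = b"name=Ann;acct=1234567890;"
vault: Dict[bytes, bytes] = {}
tokenized = tokenize_at_offsets(record, [(14, 10)], vault)
```

Keeping the token the same length as the target data is a design choice made here so that the byte addresses in the configuration information stay correct for any remaining fields.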

The migration component 122-3 performs a step of generating a script having operations for storing the migration dataset based upon the settings in the configuration information. As described herein, the migration component 122-3 may provide an interface through which the user may provide a script having instructions written in a scripting language (e.g., ANSIBLE® Playbooks™, Python™, Ruby™, Shell™, PowerShell™, and/or the like). The migration component 122-3 may communicate the script in a control directive to a server managing the cloud storage service. The server, which invokes an associated cloud computing service to execute the script, may be associated with a network address provided in the settings of the configuration information. In some alternative embodiments, the cloud computing service may execute the tokenization mechanism. In other alternative embodiments, the migration component 122-3 may provide an interface that processes code in a non-scripting language that operates similarly to the script. The cloud computing service is configured to store the migration dataset in the storage location in accordance with the settings in the configuration information.

FIG. 2 illustrates an embodiment of an apparatus 200 for the system 100. As shown in FIG. 2, file data 210 originating from a source 215 is processed by an electronic device 220 having computer memory 230 and a processing circuit 240. It is appreciated that the file data 210 includes files of various file formats and/or encoding formats. The computer memory 230 stores logic 250 that, when executed in the processing circuit 240, provides services related to migrating that file data 210 to cloud storage. Some files are database records for addition to a database system being stored in cloud storage.

Execution of the logic 250, when stored in the computer memory 230, is operative to cause the processing circuit 240 to access configuration information 260 associated with migrating any source data to the cloud storage. The configuration information 260 includes settings directed towards migrating and storing the file data 210 to a storage location in the cloud storage. Some settings include an encoding format, a file format, etc. for individual files in the file data 210. Furthermore, the configuration information 260 includes at least an address (e.g., a network address, such as a uniform resource locator (URL)) directing the logic 250 to the storage location or at least a device managing the storage location. The configuration information 260 further includes data and control inputs for securing the file data 210 during the migration process. For example, the configuration information 260 may specify which tokenization process 270 to use and identify which portion(s) of the file data 210 is to be tokenized. With respect to the tokenization process 270, the present disclosure is referring to any process that replaces data with a unique identifier known as a token (e.g., a Bank Account Number (BAN)). The configuration information 260 may also specify which encryption scheme to use in encrypting the file data 210. It is appreciated that the present disclosure does not limit the configuration information 260 to any particular combination of settings and, therefore, covers any settings corresponding to the migration process to the cloud storage. As described herein, some of these settings are used as inputs for variables in a script 280 for storing secured file data 210 in the storage location in the cloud storage.

The logic 250 is configured to cause the processing circuit 240 to convert the file data 210 into a migration dataset 290 in accordance with the settings in the configuration information 260. The logic 250 is configured to use the settings to modify portions of the file data 210 to ensure that the script 280 successfully loads the modified file data 210 into the cloud storage. The storage location, in some instances, may require the proper encoding format and/or file format. In addition to identifying a proper encoding format, the configuration information 260 may also include a proper file format for converting the file data 210 into the migration dataset 290.
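
To make the conversion step concrete, the sketch below changes both the encoding format and the file format of a small input. The choice of CSV as the source format, JSON Lines as the target format, and the function name are assumptions for illustration only; the disclosure does not name specific formats.

```python
import csv
import io
import json


def convert_csv_to_jsonl(raw: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Decode CSV bytes, emit one JSON object per row, re-encode the result."""
    text = raw.decode(source_encoding)
    reader = csv.DictReader(io.StringIO(text))
    # ensure_ascii=False keeps non-ASCII characters literal so that the
    # target encoding setting actually governs the output bytes.
    lines = [json.dumps(row, ensure_ascii=False) + "\n" for row in reader]
    return "".join(lines).encode(target_encoding)


source = "name,city\nJosé,Málaga\n".encode("latin-1")
converted = convert_csv_to_jsonl(source, "latin-1", "utf-8")
```

In this sketch the two encodings would come from the settings in the configuration information, so the same function could serve files arriving from differently configured sources.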

The configuration information 260 includes additional information to ensure a successful migration. As an example, the configuration information 260 includes a network address for the storage location into which the file data 210 is to be migrated over a network. As another example, the configuration information 260 may describe a data model identifying a structure of the file data 210, including whether the file data 210 includes database records or other datasets. In some embodiments, the structure of the file data 210 may refer to an arrangement of attributes in a database record. By way of the data model, the configuration information 260 may delineate byte addresses where target data is stored in the migration dataset 290. The script may be generated to include instructions where whole database records or individual data attributes are loaded into corresponding byte addresses in the storage location. It is appreciated that the corresponding byte addresses are logical addresses that are translated by the cloud storage service into physical addresses on a physical storage device.
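
The relationship between a data model and byte addresses can be illustrated with fixed-width records. The sketch assumes, purely for illustration, that every record has the same size and that the data model lists each attribute with an offset and a length; FieldSpec and field_spans are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FieldSpec:
    """One attribute of a fixed-width record: name, byte offset, byte length."""
    name: str
    offset: int
    length: int


def field_spans(
    model: List[FieldSpec],
    field_name: str,
    record_size: int,
    record_count: int,
) -> List[Tuple[int, int]]:
    """Return the (byte address, length) of one attribute in every record."""
    spec = next(f for f in model if f.name == field_name)
    return [(i * record_size + spec.offset, spec.length) for i in range(record_count)]


model = [FieldSpec("name", 0, 8), FieldSpec("account", 8, 10)]
spans = field_spans(model, "account", record_size=18, record_count=3)
```

The resulting spans are logical offsets into the migration dataset; they could be handed to a tokenization step or written into a generated script.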

The logic 250 is configured to cause the processing circuit 240 to apply the tokenization process 270 to a portion or portions of the migration dataset 290 having the target data. The logic 250 instructs the tokenization process 270 to replace the target data at the byte address with a token value, such as a Bank Account Number (BAN). In some embodiments, the target data may include one or more data fields in a set of database records.

The logic 250 is configured to cause the processing circuit 240 to generate the script 280 to include operations for storing the migration dataset 290 in accordance with the settings of the configuration information 260. In some embodiments, the logic 250 may insert into the script 280 the network address of the storage location such that the script 280 communicates the migration dataset 290 to an appropriate external system running cloud storage supporting services.
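
Inserting settings into a script can be sketched with a text template. The example below generates a short shell script that uploads a dataset with curl; the template, the function name, the file name, and the example.invalid address are assumptions, and an actual embodiment may emit a very different script.

```python
import shlex
from string import Template

# Hypothetical template; $dataset and $address are filled from the settings.
SCRIPT_TEMPLATE = Template("""#!/bin/sh
set -eu
# Generated from configuration information.
curl --fail --upload-file $dataset $address
""")


def generate_script(dataset_path: str, storage_address: str) -> str:
    """Fill the template, shell-quoting both values to avoid injection."""
    return SCRIPT_TEMPLATE.substitute(
        dataset=shlex.quote(dataset_path),
        address=shlex.quote(storage_address),
    )


script = generate_script(
    "migration dataset.jsonl",
    "https://storage.example.invalid/bucket/key",
)
```

Quoting the substituted values matters because the settings are user-provided; an unquoted file name containing a space or a semicolon would otherwise change what the script does.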

The logic 250 is configured to cause the processing circuit 240 to communicate the script 280 and the migration dataset 290 to a server 295. In some embodiments, the server 295 is an external computer system running cloud storage supporting services and corresponding to the network address of the storage location. Communications directed toward that network address are routed over the network to the server. The server 295 may be configured to execute the script 280 and store the migration dataset 290 in the storage location in accordance with the settings in the configuration information 260. Operations defined within the script 280 may direct the server 295 to load individual files or records into a file system or a database system, respectively. As an alternative, a dedicated proxy computer may reside between the device 220 and the server 295, and that dedicated proxy computer is configured to receive and execute the script 280 and then store the migration dataset 290 at byte addresses in the cloud storage.
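
One possible way to send the script and the migration dataset together is to package both into a single archive before transmission. The disclosure does not specify a packaging format, so the gzip-compressed tar archive, the member names, and the function name below are assumptions for illustration.

```python
import io
import tarfile


def bundle(script: str, dataset: bytes) -> bytes:
    """Pack the script and the dataset into one in-memory .tar.gz payload."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        members = (("load.sh", script.encode("utf-8")), ("dataset.bin", dataset))
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # required so the payload is actually written
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


payload = bundle("#!/bin/sh\necho load\n", b"abc")

# The receiving side could unpack the payload like this:
with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as received:
    names = received.getnames()
    dataset_back = received.extractfile("dataset.bin").read()
```

A single payload lets the receiving server or proxy computer treat the script and its data as one unit, which fits the single-operation transfer described in the abstract.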

FIG. 3 illustrates embodiments of example operating environments for the system 100. As shown in FIG. 3, an operating environment 300 illustrates a migration process for database loads, and an operating environment 350 illustrates a migration process for other datasets.

In the operating environment 300, source data (e.g., off-premises data or on-premises data) is initially encrypted via PGP technology and then communicated to a server running a cloud storage service (e.g., AMAZON® S3™). It is appreciated that such a cloud storage service may be referred to as a web service due, in part, to the service's implementation of various Internet and web technologies. To illustrate by way of an example, the server running the cloud storage service and a device requesting and configuring the cloud storage service may exchange data via a web interface, which may be a web browser application or any application capable of an Internet connection with the server. It is appreciated that the above-mentioned device may be different from devices providing the source data (i.e., off-premises data). The server running the cloud storage service may utilize a cloud computing service (e.g., AMAZON® EC2™) for various tasks, such as tokenization. In some embodiments, a server running the cloud computing service may distribute the execution of the various tasks across a plurality of computing nodes (e.g., virtual machines). After completing the tokenization of the source data, the server running the cloud computing service may execute a script to load records into a destination database. The operating environment 300 implements the destination database as a MICROSOFT® AZURE® PostgreSQL™ database.

In the operating environment 350, the system 100 encrypts source data (e.g., on-premises data) using an encryption scheme. The operating environment 350 implements an accelerator device, known as a Snowball, to accelerate data transfer of the source data to a storage location in the cloud storage service. The Snowball, in general, provides a service that accelerates transferring large amounts of the source data into and out of the cloud storage service using physical storage devices. This may be done by shipping the source data in the physical storage devices through a regional carrier, bypassing the Internet. Each Snowball device may transport data at faster-than-Internet speeds.

The server running the cloud storage service utilizes the cloud computing service to tokenize portions of the source data. These portions include some type of target data for the system 100. The target data may include sensitive or private information such as a social security number or an identifying photo. The cloud computing service, as instructed by the cloud storage service, replaces the target data with a token value, such as a Bank Account Number. After completing the tokenization process, the cloud storage service loads the tokenized source data via the REST™ API into a target location in the cloud storage's infrastructure.

FIG. 4 illustrates an embodiment of a migration process 400 for the system 100. As shown in FIG. 4, an executable pipeline 402 of modules may implement the migration process 400 for a particular operating environment. FIG. 4 further illustrates a modular aspect of the migration process 400 in that the executable pipeline 402 can be adapted in accordance with settings defined by one or more users. These settings can be consolidated into configuration information (e.g., the configuration information 260 of FIG. 2).

There are a considerable number of combinations of modules for coupling with the executable pipeline 402; as an example, FIG. 4 illustrates the executable pipeline 402 as invoking a converter, a database interface, and a tokenization process. The converter is a component capable of modifying an encoding format and/or a file format for any source data. The converter provides an API through which the executable pipeline 402 can provide source data to modify into converted source data. The database interface provides functionality for executing software code operative to load the converted source data into a database system and/or access any data in that database system. Java Database Connectivity (JDBC), an example database interface, is an application programming interface (API) for the programming language Java. The tokenization process (e.g., Turing 3.0) replaces target data (e.g., sensitive information) in the converted source data with token data. With addresses for data fields having the target data, one example embodiment of the tokenization process replaces data stored in those data fields with a token value.
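
The role of the database interface can be sketched with Python's built-in sqlite3 module standing in for JDBC. The in-memory database, the accounts table, and the column names are assumptions chosen for illustration and do not come from the disclosure.

```python
import sqlite3
from typing import Iterable, Tuple


def load_records(conn: sqlite3.Connection, records: Iterable[Tuple[str, str]]) -> int:
    """Insert converted, tokenized records and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (name TEXT, account_token TEXT)"
    )
    # Using the connection as a context manager commits on success and
    # rolls back if any insert fails, so a load is all-or-nothing.
    with conn:
        conn.executemany(
            "INSERT INTO accounts (name, account_token) VALUES (?, ?)",
            records,
        )
    return conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]


connection = sqlite3.connect(":memory:")
row_count = load_records(connection, [("Ann", "tok-1"), ("Bob", "tok-2")])
```

Parameterized statements (the ? placeholders) keep record contents from being interpreted as SQL, which matters when the source data arrives from outside the enterprise.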

FIG. 5 illustrates a block diagram of a centralized environment 500. The centralized environment 500 may implement some or all of the structure and/or operations for the system 100 in a single computing entity, such as entirely within a single device 520.

The device 520 may comprise any electronic device capable of receiving, processing, and sending information for the system 100. Examples of an electronic device may include without limitation an ultra-mobile device, a mobile device, a personal digital assistant (PDA), a mobile computing device, a smart phone, a telephone, a digital telephone, a cellular telephone, ebook readers, a handset, a one-way pager, a two-way pager, a messaging device, a computer, a personal computer (PC), a desktop computer, a laptop computer, a notebook computer, a netbook computer, a handheld computer, a tablet computer, a server, a server array or server farm, a web server, a network server, an Internet server, a workstation, a mini-computer, a mainframe computer, a supercomputer, a network appliance, a web appliance, a distributed computing system, multiprocessor systems, processor-based systems, consumer electronics, programmable consumer electronics, game devices, television, digital television, set top box, wireless access point, base station, subscriber station, mobile subscriber center, radio network controller, router, hub, gateway, bridge, switch, machine, or combination thereof. The embodiments are not limited in this context.

The device 520 may execute processing operations or logic for the system 100 using a processing component 530. The processing component 530 may comprise various hardware elements, software elements, or a combination of both. Examples of hardware elements may include devices, logic devices, components, processors, microprocessors, circuits, processor circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application-specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field-programmable gate array (FPGA), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), memory units, logic gates, registers, semiconductor device, chips, microchips, chipsets, and so forth. Examples of software elements may include software components, programs, applications, computer programs, application programs, system programs, software development programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, processes, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints, as desired for a given implementation.

The device 520 may execute communications operations or logic for the system 100 using communications component 540. The communications component 540 may implement any well-known communications techniques and protocols, such as techniques suitable for use with packet-switched networks (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), circuit-switched networks (e.g., the public switched telephone network), or a combination of packet-switched networks and circuit-switched networks (with suitable gateways and translators). The communications component 540 may include various types of standard communication elements, such as one or more communications interfaces, network interfaces, network interface cards (NIC), radios, wireless transmitters/receivers (transceivers), wired and/or wireless communication media, physical connectors, and so forth. By way of example, and not limitation, communication media 512, 542 include wired communications media and wireless communications media. Examples of wired communications media may include a wire, cable, metal leads, printed circuit boards (PCB), backplanes, switch fabrics, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, a propagated signal, and so forth. Examples of wireless communications media may include acoustic, radio-frequency (RF) spectrum, infrared and other wireless media.

The device 520 may communicate with other devices 510, 550 over communications media 512, 542, respectively, using communications signals 514, 544, respectively, via the communications component 540. The devices 510, 550 may be internal or external to the device 520 as desired for a given implementation.

As an alternative, a distributed environment distributes portions of the structure and/or operations for the system 100 across multiple computing entities. Examples of a distributed environment may include without limitation a client-server architecture, a 3-tier architecture, an N-tier architecture, a tightly-coupled or clustered architecture, a peer-to-peer architecture, a master-slave architecture, a shared database architecture, and other types of distributed systems. The embodiments are not limited in this context.

Included herein is a set of flow charts representative of exemplary methodologies for performing novel aspects of the disclosed architecture. While, for purposes of simplicity of explanation, the one or more methodologies shown herein, for example, in the form of a flow chart or flow diagram, are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance therewith, occur in a different order and/or concurrently with other acts from that shown and described herein. For example, those skilled in the art will understand and appreciate that a methodology could alternatively be represented as a series of interrelated states or events, such as in a state diagram. Moreover, not all acts illustrated in a methodology may be required for a novel implementation.

FIG. 6 illustrates one embodiment of a logic flow 600. The logic flow 600 may be representative of some or all of the operations executed by one or more embodiments described herein.

In the illustrated embodiment shown in FIG. 6, the logic flow 600 may access configuration information for a migration process at block 602. In general, the configuration information includes settings on preparing source data for the migration process. In some embodiments, one or more example settings identify which modules to use in building or generating an executable pipeline for this migration process and future migration processes. As described herein, the executable pipeline refers to a dynamic set of modules representing distinct sub-processes that, when executed in some order, form the migration process. The system may add computing modules to the executable pipeline or remove computing modules from the executable pipeline to perform the migration process. Some of these sub-processes, such as a tokenization mechanism, an encryption scheme, and/or the like, secure the source data from misappropriation. Other modules provide APIs to prepare (e.g., convert) the source data into a migration dataset suitable for the migration process. One example module provides an interface that is operative to transform user-submitted JAVA code into a script for loading the migration dataset and completing the migration process.
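By way of illustration only, the executable pipeline described above may be sketched in Python as an ordered, modifiable list of modules selected by the settings. The names ExecutablePipeline, build_pipeline, normalize_newlines, and to_upper, as well as the "modules" settings key, are hypothetical and are not part of the described embodiments; each module is assumed to be a function that maps bytes to bytes.

```python
from typing import Callable, Dict, List

# Assumption: a module is modeled as a function from bytes to bytes.
Module = Callable[[bytes], bytes]


class ExecutablePipeline:
    """Ordered, modifiable set of modules that together form a migration process."""

    def __init__(self) -> None:
        self._modules: List[Module] = []

    def add(self, module: Module) -> None:
        # Append a module as the last sub-process of the pipeline.
        self._modules.append(module)

    def remove(self, module: Module) -> None:
        # Remove a previously added module from the pipeline.
        self._modules.remove(module)

    def run(self, source_data: bytes) -> bytes:
        # Execute every module in order, feeding each output to the next module.
        data = source_data
        for module in self._modules:
            data = module(data)
        return data


def normalize_newlines(data: bytes) -> bytes:
    # Hypothetical conversion module: use LF line endings.
    return data.replace(b"\r\n", b"\n")


def to_upper(data: bytes) -> bytes:
    # Hypothetical conversion module: upper-case ASCII letters.
    return data.upper()


def build_pipeline(settings: Dict[str, List[str]]) -> ExecutablePipeline:
    # The settings name the modules to use; the registry maps names to code.
    registry: Dict[str, Module] = {
        "normalize_newlines": normalize_newlines,
        "to_upper": to_upper,
    }
    pipeline = ExecutablePipeline()
    for name in settings["modules"]:
        pipeline.add(registry[name])
    return pipeline
```

In this sketch, a call such as build_pipeline({"modules": ["normalize_newlines"]}) yields a pipeline whose run method converts the source data; modules may later be added or removed without rebuilding the pipeline.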

In some embodiments, some example settings of the configuration information describe instructions on modifying source data, enabling the migration process of that source data to cloud storage. As described herein, the logic flow 600 may access settings indicating a network address (e.g., an IP address) of a cloud storage service that is configured for administrating the actual storage of the source data. The settings may also include a data model describing how to arrange data items in the source data such that an individual data item can be identified. The data model, in general, describes the source data's structure in terms of composite data items and their data types. If the data model describes a database, the data model identifies rows/columns in a database table as well as any indices. Each row may refer to a database record comprising a set of data items of which each data item is addressable as a byte offset from a beginning of the database record. If the data model describes a file system, the data model identifies each folder and any data files within each folder.
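As a non-limiting sketch, a data model in which each data item is addressable as a byte offset from the beginning of a database record may be expressed in Python as follows. The record layout (customer_id, account_number, name), the field widths, and the names Field, DATA_MODEL, and read_record are hypothetical and chosen only for illustration; fixed-width ASCII fields are assumed.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Field:
    name: str
    offset: int  # byte offset from the beginning of the record
    length: int  # width of the data item in bytes


# Hypothetical fixed-width layout for one database record (32 bytes in total).
DATA_MODEL: Tuple[Field, ...] = (
    Field("customer_id", 0, 8),
    Field("account_number", 8, 12),
    Field("name", 20, 12),
)


def read_record(record: bytes, model: Tuple[Field, ...]) -> Dict[str, str]:
    """Locate each data item by its byte offset and decode it as ASCII text."""
    items: Dict[str, str] = {}
    for field in model:
        raw = record[field.offset:field.offset + field.length]
        # Fixed-width fields are assumed to be padded with trailing spaces.
        items[field.name] = raw.decode("ascii").rstrip()
    return items
```

Given such a model, an individual data item can be identified without parsing the rest of the record, which is what allows later steps to address a specific field directly.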

The logic flow 600 may convert file data in accordance with the settings at block 604. It is appreciated that the logic flow 600 may perform the conversion in a distributed environment or a centralized environment. To illustrate by way of examples, the file data may be a set of files that are generated by a plurality of users and transmitted to a central, on-premises facility and scheduled for migration some time after reception. In another example, the file data may be a set of files that are maintained by a plurality of users at their devices and then scheduled for migration (from the users' devices) at a fixed time and date.

In some embodiments, the logic flow 600 may convert the file data into an appropriate encoding format and/or file format for the cloud storage. Data items in the file data may be structured in accordance with a data model (e.g., a relational database model). The settings may identify byte addresses where target data for tokenization is stored. As described herein, the target data may include sensitive or private information. These byte addresses may refer to data fields (in database records) or data items (in other datasets) having the sensitive or private information. In some embodiments, the logic flow 600 may execute the tokenization process at an on-premises facility while, in other embodiments, the logic flow 600 may instruct a server hosting a cloud computing service to replace a portion of the file data with a token value.
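The tokenization step described above may be sketched, under stated assumptions, as replacing the bytes at each configured byte address with a token value. In the following Python sketch, the tokenize function, the (offset, length) form of the targets, and the in-memory vault that maps tokens back to original values are all hypothetical; a random hexadecimal token of the same length as the target data is assumed so that the record layout is preserved.

```python
import secrets
from typing import Dict, List, Tuple


def tokenize(
    record: bytes,
    targets: List[Tuple[int, int]],
    vault: Dict[bytes, bytes],
) -> bytes:
    """Replace each (offset, length) span of target data with a token value.

    The vault records which token stands for which original value so that an
    authorized party could later reverse the substitution.
    """
    out = bytearray(record)
    for offset, length in targets:
        original = bytes(out[offset:offset + length])
        # Assumption: a random hex string, cut to the span length, is the token.
        token = secrets.token_hex(length)[:length].encode("ascii")
        vault[token] = original
        # Same-length replacement keeps every other byte offset valid.
        out[offset:offset + length] = token
    return bytes(out)
```

Because the token has the same length as the data it replaces, byte offsets named elsewhere in the settings remain valid after tokenization. A production tokenization mechanism would typically keep the vault in protected storage rather than in memory.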

The logic flow 600 may generate a script in accordance with the settings at block 606. For example, the logic flow 600 may run the script from the server hosting the cloud computing service to load the migration dataset into the storage location. The script may be configured in a variety of scripting languages (e.g., PowerShell). The settings in the configuration information describe the storage location at least in terms of physical and logical addresses. The settings may include a network address for the cloud computing service and/or a physical storage device corresponding to the storage location. The settings may also describe a data model in operation at the storage location; the data model introduces a logical addressing scheme for identifying stored data within the physical storage device.

If the migration dataset includes database records, the logic flow 600 inserts into the script database commands (e.g., SQL commands) that are configured to load these database records into a destination database. These database commands comply with a data model in operation at the destination database. If the migration dataset includes data items other than a type of database record, the logic flow 600 inserts into the script instructions that are configured to store these data items into a (destination) volume. Similar to the approach used for the database records, the logic flow 600 generates the above-mentioned instructions to comply with a data model (e.g., a file system) in operation at the volume.
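For the database-record case described above, script generation may be sketched as emitting one SQL INSERT command per record of the migration dataset. The Python sketch below is illustrative only: the functions sql_literal and generate_load_script, and the table and column names supplied by a caller, are hypothetical, and the output is a plain SQL script rather than a script in any particular scripting language named herein. Table and column names are assumed to come from trusted settings and are not escaped.

```python
from typing import Dict, List


def sql_literal(value: object) -> str:
    """Render a Python value as a SQL literal (numbers, NULL, or quoted text)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    # Quote text and escape embedded single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def generate_load_script(
    table: str,
    columns: List[str],
    records: List[Dict[str, object]],
) -> str:
    """Build a SQL script that loads the records into the destination table."""
    column_list = ", ".join(columns)
    lines = ["BEGIN;"]
    for record in records:
        values = ", ".join(sql_literal(record[column]) for column in columns)
        lines.append(f"INSERT INTO {table} ({column_list}) VALUES ({values});")
    lines.append("COMMIT;")
    return "\n".join(lines)
```

Wrapping the commands in a single transaction is a design choice of this sketch, so that a failed load leaves the destination database unchanged. The column list stands in for the data model in operation at the destination database.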

The logic flow 600 may communicate the script to a server or a dedicated proxy computer at block 608. The server described herein refers to a computer hosting the cloud computing service. For example, the logic flow 600 may invoke functionality through an interface to the cloud computing service where the invocation of such functionality instructs the server to run the script. The embodiments are not limited to this example.

FIG. 7 illustrates an embodiment of an exemplary computing architecture 700 suitable for implementing various embodiments as previously described. In one embodiment, the computing architecture 700 may comprise or be implemented as part of an electronic device. Examples of an electronic device may include those described with reference to FIG. 8, among others. The embodiments are not limited in this context.

As used in this application, the terms “system” and “component” are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution, examples of which are provided by the exemplary computing architecture 700. For example, a component can be, but is not limited to being, a process running on a processor, a processor, a hard disk drive, multiple storage drives (of optical and/or magnetic storage medium), an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a server and the server can be a component. One or more components can reside within a process and/or thread of execution, and a component can be localized on one computer and/or distributed between two or more computers. Further, components may be communicatively coupled to each other by various types of communications media to coordinate operations. The coordination may involve the uni-directional or bi-directional exchange of information. For instance, the components may communicate information in the form of signals communicated over the communications media. The information can be implemented as signals allocated to various signal lines. In such allocations, each message is a signal. Further embodiments, however, may alternatively employ data messages. Such data messages may be sent across various connections. Exemplary connections include parallel interfaces, serial interfaces, and bus interfaces.

The computing architecture 700 includes various common computing elements, such as one or more processors, multi-core processors, co-processors, memory units, chipsets, controllers, peripherals, interfaces, oscillators, timing devices, video cards, audio cards, multimedia input/output (I/O) components, power supplies, and so forth. The embodiments, however, are not limited to implementation by the computing architecture 700.

As shown in FIG. 7, the computing architecture 700 comprises a processing unit 704, a system memory 706 and a system bus 708. The processing unit 704 can be any of various commercially available processors, including without limitation an AMD® Athlon®, Duron® and Opteron® processors; ARM® application, embedded and secure processors; IBM® and Motorola® DragonBall® and PowerPC® processors; IBM and Sony® Cell processors; Intel® Celeron®, Core (2) Duo®, Itanium®, Pentium®, Xeon®, and XScale® processors; and similar processors. Dual microprocessors, multi-core processors, and other multi-processor architectures may also be employed as the processing unit 704.

The system bus 708 provides an interface for system components including, but not limited to, the system memory 706 to the processing unit 704. The system bus 708 can be any of several types of bus structure that may further interconnect to a memory bus (with or without a memory controller), a peripheral bus, and a local bus using any of a variety of commercially available bus architectures. Interface adapters may connect to the system bus 708 via a slot architecture. Example slot architectures may include without limitation Accelerated Graphics Port (AGP), Card Bus, (Extended) Industry Standard Architecture ((E)ISA), Micro Channel Architecture (MCA), NuBus, Peripheral Component Interconnect (Extended) (PCI(X)), PCI Express, Personal Computer Memory Card International Association (PCMCIA), and the like.

The computing architecture 700 may comprise or implement various articles of manufacture. An article of manufacture may comprise a computer-readable storage medium to store logic. Examples of a computer-readable storage medium may include any tangible media capable of storing electronic data, including volatile memory or non-volatile memory, removable or non-removable memory, erasable or non-erasable memory, writeable or re-writeable memory, and so forth. Examples of logic may include executable computer program instructions implemented using any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, object-oriented code, visual code, and the like. Embodiments may also be at least partly implemented as instructions contained in or on a non-transitory computer-readable medium, which may be read and executed by one or more processors to enable performance of the operations described herein.

The system memory 706 may include various types of computer-readable storage media in the form of one or more higher speed memory units, such as read-only memory (ROM), random-access memory (RAM), dynamic RAM (DRAM), Double-Data-Rate DRAM (DDRAM), synchronous DRAM (SDRAM), static RAM (SRAM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, polymer memory such as ferroelectric polymer memory, ovonic memory, phase change or ferroelectric memory, silicon-oxide-nitride-oxide-silicon (SONOS) memory, magnetic or optical cards, an array of devices such as Redundant Array of Independent Disks (RAID) drives, solid state memory devices (e.g., USB memory, solid state drives (SSD)) and any other type of storage media suitable for storing information. In the illustrated embodiment shown in FIG. 7, the system memory 706 can include non-volatile memory 710 and/or volatile memory 712. A basic input/output system (BIOS) can be stored in the non-volatile memory 710.

The computer 702 may include various types of computer-readable storage media in the form of one or more lower speed memory units, including an internal (or external) hard disk drive (HDD) 714, a magnetic floppy disk drive (FDD) 716 to read from or write to a removable magnetic disk 718, and an optical disk drive 720 to read from or write to a removable optical disk 722 (e.g., a CD-ROM or DVD). The HDD 714, FDD 716 and optical disk drive 720 can be connected to the system bus 708 by a HDD interface 724, an FDD interface 726 and an optical drive interface 728, respectively. The HDD interface 724 for external drive implementations can include at least one or both of Universal Serial Bus (USB) and Institute of Electrical and Electronics Engineers (IEEE) 1394 interface technologies.

The drives and associated computer-readable media provide volatile and/or nonvolatile storage of data, data structures, computer-executable instructions, and so forth. For example, a number of program modules can be stored in the drives and memory units 710, 712, including an operating system 730, one or more application programs 732, other program modules 734, and program data 736. In one embodiment, the one or more application programs 732, other program modules 734, and program data 736 can include, for example, the various applications and/or components of the system 100.

A user can enter commands and information into the computer 702 through one or more wire/wireless input devices, for example, a keyboard 738 and a pointing device, such as a mouse 740. Other input devices may include microphones, infra-red (IR) remote controls, radio-frequency (RF) remote controls, game pads, stylus pens, card readers, dongles, finger print readers, gloves, graphics tablets, joysticks, keyboards, retina readers, touch screens (e.g., capacitive, resistive, etc.), trackballs, trackpads, sensors, styluses, and the like. These and other input devices are often connected to the processing unit 704 through an input device interface 742 that is coupled to the system bus 708, but can be connected by other interfaces such as a parallel port, IEEE 1394 serial port, a game port, a USB port, an IR interface, and so forth.

A monitor 744 or other type of display device is also connected to the system bus 708 via an interface, such as a video adaptor 746. The monitor 744 may be internal or external to the computer 702. In addition to the monitor 744, a computer typically includes other peripheral output devices, such as speakers, printers, and so forth.

The computer 702 may operate in a networked environment using logical connections via wire and/or wireless communications to one or more remote computers, such as a remote computer 748. The remote computer 748 can be a workstation, a server computer, a router, a personal computer, portable computer, microprocessor-based entertainment appliance, a peer device or other common network node, and typically includes many or all of the elements described relative to the computer 702, although, for purposes of brevity, only a memory/storage device 750 is illustrated. The logical connections depicted include wire/wireless connectivity to a local area network (LAN) 752 and/or larger networks, for example, a wide area network (WAN) 754. Such LAN and WAN networking environments are commonplace in offices and companies, and facilitate enterprise-wide computer networks, such as intranets, all of which may connect to a global communications network, for example, the Internet.

When used in a LAN networking environment, the computer 702 is connected to the LAN 752 through a wire and/or wireless communication network interface or adaptor 756. The adaptor 756 can facilitate wire and/or wireless communications to the LAN 752, which may also include a wireless access point disposed thereon for communicating with the wireless functionality of the adaptor 756.

When used in a WAN networking environment, the computer 702 can include a modem 758, or is connected to a communications server on the WAN 754, or has other means for establishing communications over the WAN 754, such as by way of the Internet. The modem 758, which can be internal or external and a wire and/or wireless device, connects to the system bus 708 via the input device interface 742. In a networked environment, program modules depicted relative to the computer 702, or portions thereof, can be stored in the remote memory/storage device 750. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers can be used.

The computer 702 is operable to communicate with wire and wireless devices or entities using the IEEE 802 family of standards, such as wireless devices operatively disposed in wireless communication (e.g., IEEE 802.11 over-the-air modulation techniques). This includes at least Wi-Fi (or Wireless Fidelity), WiMax, and Bluetooth™ wireless technologies, among others. Thus, the communication can be a predefined structure as with a conventional network or simply an ad hoc communication between at least two devices. Wi-Fi networks use radio technologies called IEEE 802.11x (a, b, g, n, etc.) to provide secure, reliable, fast wireless connectivity. A Wi-Fi network can be used to connect computers to each other, to the Internet, and to wire networks (which use IEEE 802.3-related media and functions).

FIG. 8 illustrates a block diagram of an exemplary communications architecture 800 suitable for implementing various embodiments as previously described. The communications architecture 800 includes various common communications elements, such as a transmitter, receiver, transceiver, radio, network interface, baseband processor, antenna, amplifiers, filters, power supplies, and so forth. The embodiments, however, are not limited to implementation by the communications architecture 800.

As shown in FIG. 8, the communications architecture 800 includes one or more clients 802 and servers 804. The clients 802 may implement the client device 910. The servers 804 may implement the server device 950. The clients 802 and the servers 804 are operatively connected to one or more respective client data stores 808 and server data stores 810 that can be employed to store information local to the respective clients 802 and servers 804, such as cookies and/or associated contextual information.

The clients 802 and the servers 804 may communicate information between each other using a communications framework 806. The communications framework 806 may implement any well-known communications techniques and protocols. The communications framework 806 may be implemented as a packet-switched network (e.g., public networks such as the Internet, private networks such as an enterprise intranet, and so forth), a circuit-switched network (e.g., the public switched telephone network), or a combination of a packet-switched network and a circuit-switched network (with suitable gateways and translators).

The communications framework 806 may implement various network interfaces arranged to accept, communicate, and connect to a communications network. A network interface may be regarded as a specialized form of an input/output interface. Network interfaces may employ connection protocols including without limitation direct connect, Ethernet (e.g., thick, thin, twisted pair 10/100/1000 Base T, and the like), token ring, wireless network interfaces, cellular network interfaces, IEEE 802.11a-x network interfaces, IEEE 802.16 network interfaces, IEEE 802.20 network interfaces, and the like. Further, multiple network interfaces may be used to engage with various communications network types. For example, multiple network interfaces may be employed to allow for the communication over broadcast, multicast, and unicast networks. Should processing requirements dictate a greater amount of speed and capacity, distributed network controller architectures may similarly be employed to pool, load balance, and otherwise increase the communicative bandwidth required by clients 802 and the servers 804. A communications network may be any one or a combination of wired and/or wireless networks including without limitation a direct interconnection, a secured custom connection, a private network (e.g., an enterprise intranet), a public network (e.g., the Internet), a Personal Area Network (PAN), a Local Area Network (LAN), a Metropolitan Area Network (MAN), an Operating Missions as Nodes on the Internet (OMNI), a Wide Area Network (WAN), a wireless network, a cellular network, and other communications networks.

Some embodiments may be described using the expression “one embodiment” or “an embodiment” along with their derivatives. These terms mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment. Further, some embodiments may be described using the expression “coupled” and “connected” along with their derivatives. These terms are not necessarily intended as synonyms for each other. For example, some embodiments may be described using the terms “connected” and/or “coupled” to indicate that two or more elements are in direct physical or electrical contact with each other. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

It is emphasized that the Abstract of the Disclosure is provided to allow a reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment. In the appended claims, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein,” respectively. Moreover, the terms “first,” “second,” “third,” and so forth, are used merely as labels, and are not intended to impose numerical requirements on their objects.

What has been described above includes examples of the disclosed architecture. It is, of course, not possible to describe every conceivable combination of components and/or methodologies, but one of ordinary skill in the art may recognize that many further combinations and permutations are possible. Accordingly, the novel architecture is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.

1. An apparatus, comprising: a processing circuit; and logic stored in computer memory and executed on the processing circuit, the logic operative to cause the processing circuit to: access configuration information associated with migrating source data to a cloud storage service, the configuration information comprising settings directed towards storing the source data in the cloud storage service; convert file data to be migrated to a storage location over a network into a migration dataset in accordance with the settings in the configuration information, the storage location corresponding to the cloud storage service; generate a script for storing the migration dataset in the storage location, the script being generated based upon the settings in the configuration information; and communicate the script and the migration dataset to a server associated with the cloud storage service, the server configured to execute the script and store the migration dataset in the storage location in accordance with the settings in the configuration information.
2. The apparatus of claim 1, comprising logic operative to cause the processing circuit to generate an executable pipeline based upon the settings in the configuration information and apply the executable pipeline to convert the file data in response to receiving the file data over the network.
3. The apparatus of claim 1, comprising logic operative to cause the processing circuit to communicate the migration dataset to a proxy server and instruct the proxy server to execute the script and communicate the migration dataset to the storage location.
4. The apparatus of claim 1, comprising logic operative to cause the processing circuit to apply a tokenization mechanism to a portion of the migration dataset having sensitive information, the portion being identified in the settings in the configuration information.
5. The apparatus of claim 1, comprising logic operative to cause the processing circuit to generate the script using a network address and a data model corresponding to the storage location, the network address and the data model being based upon the settings in the configuration information.
6. The apparatus of claim 1, comprising logic operative to cause the processing circuit to convert the file data according to an encoding format, the encoding format being based upon the settings in the configuration information.
7. The apparatus of claim 1, comprising logic operative to cause the processing circuit to access the settings comprising information to identify sensitive information within a file.
8. A computer-implemented method executed on a processing circuit, the method comprising: accessing configuration information associated with migrating source data to a cloud storage service, the configuration information comprising settings directed towards storing the source data in a storage location of the cloud storage service; converting file data into a migration dataset in accordance with the settings in the configuration information, including applying a tokenization mechanism to a portion of the file data having target data, the portion being identified in the settings in the configuration information; generating a script for storing the migration dataset in the storage location, the script being generated based upon the settings in the configuration information; and communicating the script and the migration dataset to a server managing the cloud storage, the server configured to execute the script and store the migration dataset in the storage location in accordance with the settings in the configuration information.
9. The computer-implemented method of claim 8, wherein the settings specify an encryption scheme for encrypting the file data for the migration dataset.
10. The computer-implemented method of claim 8, comprising generating an executable pipeline based upon the settings in the configuration information, the executable pipeline being operative to convert the file data.
11. The computer-implemented method of claim 10, comprising adding computing modules to the executable pipeline or removing computing modules from the executable pipeline.
12. The computer-implemented method of claim 8, wherein a token value to replace the portion having the target data comprises a bank account number.
13. The computer-implemented method of claim 8, comprising generating the script using a network address and a data model corresponding to the storage location, the settings in the configuration information comprising the network address and the data model.
14. The computer-implemented method of claim 8, comprising identifying sensitive information to tokenize within a database record.
15. At least one computer-readable storage medium comprising processor-executable instructions that, when executed, cause a system to: access configuration information associated with migrating source data to cloud storage, the configuration information comprising settings identifying a storage location and a data model for storing the source data in the cloud storage; in response to file data being migrated to the storage location over a network, convert the file data into a migration dataset in accordance with the settings in the configuration information; apply a tokenization mechanism to a portion of the migration dataset having sensitive information, the portion being identified in the settings in the configuration information; generate a script for storing the migration dataset into the storage location, the script being generated based upon the settings in the configuration information; and communicate the script and the migration dataset to a server managing the cloud storage, the server configured to execute the script and store the migration dataset in the storage location in accordance with the settings in the configuration information.
16. The computer-readable storage medium of claim 15, comprising processor-executable instructions that, when executed, cause the system to: generate the script in accordance with a data model specified in the settings in the configuration information.
17. The computer-readable storage medium of claim 15, comprising processor-executable instructions that, when executed, cause the system to: instruct a dedicated proxy computer to communicate the source data from the storage location.
18. The computer-readable storage medium of claim 15, comprising processor-executable instructions that, when executed, cause the system to: generate the script to load database records into a destination database.
19. The computer-readable storage medium of claim 15, comprising processor-executable instructions that, when executed, cause the system to: generate an executable pipeline based upon the settings in the configuration information, the executable pipeline being operative to convert the file data.
20. The computer-readable storage medium of claim 19, comprising processor-executable instructions that, when executed, cause the system to: add computing modules to the executable pipeline or remove computing modules from the executable pipeline.